Smart Paper Notes

Lego-MT: Towards Detachable Models in Massively Multilingual Machine Translation

Fei Yuan , Yinquan Lu , WenHao Zhu , Lingpeng Kong , Lei Li , Jingjing Xu

Categories: Natural Language Processing | Artificial Intelligence

2022-12-20

Traditional multilingual neural machine translation (MNMT) uses a single model to translate all directions. However, with the increasing scale of language pairs, simply using a single model for massive MNMT brings new challenges: parameter tension and large computations. In this paper, we revisit multi-way structures by assigning an individual branch for each language (group). Despite being a simple architecture, it is challenging to train de-centralized models due to the lack of constraints to align representations from all languages. We propose a localized training recipe to map different branches into a unified space, resulting in an efficient detachable model, Lego-MT. For a fair comparison, we collect data from OPUS and build the first large-scale open-source translation benchmark covering 7 language-centric data, each containing 445 language pairs. Experiments show that Lego-MT (1.2B) brings gains of more than 4 BLEU while outperforming M2M-100 (12B). We will make all training data, models, and checkpoints public.

Translated by Google Translate

WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning

Wenhao Wu , Wei Li , Xinyan Xiao , Jiachen Liu , Sujian Li , Yajuan Lv

Categories: Natural Language Processing

2022-12-20

A crucial issue of current text generation models is that they often uncontrollably generate text that is factually inconsistent with respect to their inputs. Limited by the lack of annotated data, existing works in evaluating factual consistency directly transfer the reasoning ability of models trained on other data-rich upstream tasks like question answering (QA) and natural language inference (NLI) without any further adaptation. As a result, they perform poorly on real generated text and are heavily biased by their single-source upstream tasks. To alleviate this problem, we propose a weakly supervised framework that aggregates multiple resources to train a precise and efficient factual metric, namely WeCheck. WeCheck first utilizes a generative model to accurately label a real generated sample by aggregating its weak labels, which are inferred from multiple resources. Then, we train the target metric model with the weak supervision while taking noise into consideration. Comprehensive experiments on a variety of tasks demonstrate the strong performance of WeCheck, which achieves a 3.4% absolute improvement on average over previous state-of-the-art methods on the TRUE benchmark.

AdaCM: Adaptive ColorMLP for Real-Time Universal Photo-realistic Style Transfer

Tianwei Lin , Honglin Lin , Fu Li , Dongliang He , Wenhao Wu , Meiling Wang , Xin Li , Yong Liu

Categories: Computer Vision

2022-12-03

Photo-realistic style transfer aims at migrating the artistic style from an exemplar style image to a content image, producing a result image without spatial distortions or unrealistic artifacts. Impressive results have been achieved by recent deep models. However, deep neural network based methods are too expensive to run in real-time. Meanwhile, bilateral grid based methods are much faster but still contain artifacts like overexposure. In this work, we propose the Adaptive ColorMLP (AdaCM), an effective and efficient framework for universal photo-realistic style transfer. First, we find the complex non-linear color mapping between input and target domain can be efficiently modeled by a small multi-layer perceptron (ColorMLP) model. Then, in AdaCM, we adopt a CNN encoder to adaptively predict all parameters for the ColorMLP conditioned on each input content and style image pair. Experimental results demonstrate that AdaCM can generate vivid and high-quality stylization results. Meanwhile, our AdaCM is ultrafast and can process a 4K resolution image in 6ms on one V100 GPU.
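
The per-pixel color mapping described above can be illustrated with a toy sketch. It is not the authors' implementation: the CNN encoder is not modeled, so the MLP parameters are simply passed in, and the function names (`color_mlp`, `stylize`), the ReLU activation, and the final clamp to [0, 1] are all assumptions made for illustration.

```python
def color_mlp(pixel, layers):
    """Map one RGB pixel through a small MLP.

    layers: list of (weights, biases) pairs, where weights is a list of rows.
    ReLU follows every layer except the last (an assumption of this sketch).
    """
    x = list(pixel)
    for i, (weights, biases) in enumerate(layers):
        # Affine transform: one output value per weight row.
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    # Clamp to the valid color range (assumed to be [0, 1]).
    return [min(1.0, max(0.0, v)) for v in x]


def stylize(image, layers):
    """Apply the same color MLP independently to every pixel of an image."""
    return [[color_mlp(p, layers) for p in row] for row in image]


# In AdaCM the parameters would come from an encoder conditioned on the
# content/style pair; here they are hard-coded placeholders.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
layers = [(identity, [0.0, 0.0, 0.0])]
result = stylize([[(0.2, 0.4, 0.6)]], layers)
```

Because each pixel is mapped independently, the cost grows linearly with the pixel count and the mapping cannot introduce spatial distortion, which fits the motivation given in the abstract.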

iEnhancer-ELM: Improve Enhancer Identification by Extracting Multi-scale Contextual Information based on Enhancer Language Models

Jiahao Li , Zhourun Wu , Wenhao Lin , Jiawei Luo , Jun Zhang , Qingcai Chen , Junjie Chen

Categories: Machine Learning

2022-12-03

Motivation: Enhancers are important cis-regulatory elements that regulate a wide range of biological functions and enhance the transcription of target genes. Although many state-of-the-art computational methods have been proposed to efficiently identify enhancers, learning globally contextual features is still one of the challenges for computational methods. Given the similarities between biological sequences and natural language sentences, novel BERT-based language techniques have been applied to extracting complex contextual features in various computational biology tasks such as protein function/structure prediction. To speed up the research on enhancer identification, it is urgent to construct a BERT-based enhancer language model. Results: In this paper, we propose a multi-scale enhancer identification method (iEnhancer-ELM) based on enhancer language models, which treat enhancer sequences as natural language sentences that are composed of k-mer nucleotides. iEnhancer-ELM can extract contextual information of multi-scale k-mers with positions from raw enhancer sequences. Benefiting from the complementary information of multi-scale k-mers, we ensemble four iEnhancer-ELM models to improve enhancer identification. The benchmark comparisons show that our model outperforms state-of-the-art methods. Using the interpretable attention mechanism, we find 30 biological patterns, of which 40% (12/30) are verified by a widely used motif tool (STREME) and a popular dataset (JASPAR), demonstrating that our model has the potential to reveal the biological mechanisms of enhancers. Availability: The source code is available at https://github.com/chen-bioinfo/iEnhancer-ELM Contact: junjiechen@hit.edu.cn and junjie.chen.hit@gmail.com; Supplementary information: Supplementary data are available at Bioinformatics online.
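
As a rough illustration of treating a sequence as a sentence of k-mers at several scales, the sketch below tokenizes one nucleotide string for multiple values of k. The overlapping stride-1 windows and the specific scales (3, 4, 5, 6) are assumptions chosen for the example; the abstract states only that four models are ensembled, not which k values they use. The function names are hypothetical.

```python
def kmer_tokens(sequence, k):
    """Split a nucleotide sequence into overlapping k-mers (stride 1)."""
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    # A sequence shorter than k yields an empty token list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def multiscale_kmers(sequence, ks=(3, 4, 5, 6)):
    """Tokenize the same sequence at several scales, one 'sentence' per k."""
    return {k: kmer_tokens(sequence, k) for k in ks}


tokens = multiscale_kmers("acgtac")
# tokens[3] is ['ACG', 'CGT', 'GTA', 'TAC']
```

Each token list could then be fed to a separate language model, with the per-scale predictions combined in an ensemble.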

Class-based Quantization for Neural Networks

Wenhao Sun , Grace Li Zhang , Huaxi Gu , Bing Li , Ulf Schlichtmann

Categories: Machine Learning

2022-11-27

In deep neural networks (DNNs), there are a huge number of weights and multiply-and-accumulate (MAC) operations. Accordingly, it is challenging to apply DNNs on resource-constrained platforms, e.g., mobile phones. Quantization is a method to reduce the size and the computational complexity of DNNs. Existing quantization methods either require hardware overhead to achieve a non-uniform quantization or focus on model-wise and layer-wise uniform quantization, which are not as fine-grained as filter-wise quantization. In this paper, we propose a class-based quantization method to determine the minimum number of quantization bits for each filter or neuron in DNNs individually. In the proposed method, the importance score of each filter or neuron with respect to the number of classes in the dataset is first evaluated. The larger the score is, the more important the filter or neuron is and thus the larger the number of quantization bits should be. Afterwards, a search algorithm is adopted to exploit the different importance of filters and neurons to determine the number of quantization bits of each filter or neuron. Experimental results demonstrate that the proposed method can maintain the inference accuracy with low bit-width quantization. Given the same number of quantization bits, the proposed method can also achieve a better inference accuracy than the existing methods.
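
The idea of giving more important filters more quantization bits can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: `bits_from_scores` is a hypothetical linear rule standing in for the search algorithm mentioned in the abstract, the importance scores are taken as given, and symmetric uniform quantization with a per-filter scale is an assumed quantizer.

```python
def quantize_uniform(weights, bits):
    """Symmetric uniform quantization of one filter's weights to `bits` bits."""
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed grid")
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1      # e.g. 127 positive levels for 8 bits
    scale = max_abs / levels          # per-filter step size
    return [round(w / scale) * scale for w in weights]


def bits_from_scores(scores, min_bits=2, max_bits=8):
    """Hypothetical rule: bit-width grows linearly with importance score."""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0           # avoid division by zero if all equal
    return [min_bits + round((s - lo) / span * (max_bits - min_bits))
            for s in scores]


filters = [[1.0, -0.5, 0.25], [0.3, 0.1, -0.2], [0.9, -0.9, 0.45]]
scores = [0.0, 0.5, 1.0]              # placeholder importance scores
bit_widths = bits_from_scores(scores)
quantized = [quantize_uniform(f, b) for f, b in zip(filters, bit_widths)]
```

With a per-filter scale, the rounding error of each weight is bounded by half a quantization step, so filters assigned more bits are reproduced more faithfully.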

SteppingNet: A Stepping Neural Network with Incremental Accuracy Enhancement

Wenhao Sun , Grace Li Zhang , Xunzhao Yin , Cheng Zhuo , Huaxi Gu , Bing Li , Ulf Schlichtmann

Categories: Machine Learning

2022-11-27

Deep neural networks (DNNs) have successfully been applied in many fields in the past decades. However, the increasing number of multiply-and-accumulate (MAC) operations in DNNs prevents their application in resource-constrained and resource-varying platforms, e.g., mobile phones and autonomous vehicles. In such platforms, neural networks need to provide acceptable results quickly and the accuracy of the results should be able to be enhanced dynamically according to the computational resources available in the computing system. To address these challenges, we propose a design framework called SteppingNet. SteppingNet constructs a series of subnets whose accuracy is incrementally enhanced as more MAC operations become available. Therefore, this design allows a trade-off between accuracy and latency. In addition, the larger subnets in SteppingNet are built upon smaller subnets, so that the results of the latter can directly be reused in the former without recomputation. This property allows SteppingNet to decide on-the-fly whether to enhance the inference accuracy by executing further MAC operations. Experimental results demonstrate that SteppingNet provides an effective incremental accuracy improvement and its inference accuracy consistently outperforms the state-of-the-art work under the same limit of computational resources.
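
The reuse property described above, where a larger subnet builds on a smaller one without recomputation, can be illustrated with a toy one-hidden-layer network whose hidden units are added in steps. This is only a sketch of the general idea, not SteppingNet's construction: the class name `SteppingMLP`, the single scalar output, and the MAC-counting convention are assumptions.

```python
def relu(v):
    return v if v > 0 else 0.0


class SteppingMLP:
    """Toy network whose hidden units are evaluated in incremental steps."""

    def __init__(self, w1, w2, step_sizes):
        # w1: one input-weight vector per hidden unit
        # w2: one output weight per hidden unit
        # step_sizes: number of hidden units added at each step
        assert sum(step_sizes) == len(w1) == len(w2)
        self.w1, self.w2, self.step_sizes = w1, w2, step_sizes

    def run(self, x):
        """Yield (output, macs_so_far) after each step.

        The running output of earlier steps is reused; only the newly added
        hidden units are evaluated, so the caller can stop at any step.
        """
        out, macs, start = 0.0, 0, 0
        for size in self.step_sizes:
            for j in range(start, start + size):
                hidden = relu(sum(a * b for a, b in zip(self.w1[j], x)))
                out += self.w2[j] * hidden
                macs += len(x) + 1    # dot product plus one output MAC
            start += size
            yield out, macs


net = SteppingMLP(w1=[[1, 0], [0, 1], [1, 1]], w2=[1.0, 2.0, -1.0],
                  step_sizes=[1, 2])
steps = list(net.run([1.0, 2.0]))
```

Because `run` is a generator, a caller can take the first result when resources are tight and request further steps only if more MAC operations can be afforded.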

Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense

Zhecan Wang , Haoxuan You , Yicheng He , Wenhao Li , Kai-Wei Chang , Shih-Fu Chang

Categories: Computer Vision | Artificial Intelligence | Natural Language Processing

2022-11-10

Visual commonsense understanding requires Vision Language (VL) models to not only understand image and text but also cross-reference between them to fully integrate and comprehend the visual scene described. Recently, various approaches have been developed and have achieved high performance on visual commonsense benchmarks. However, it is unclear whether the models really understand the visual scene and underlying commonsense knowledge due to limited evaluation data resources. To provide an in-depth analysis, we present a Multimodal Evaluation (ME) pipeline to automatically generate question-answer pairs to test models' understanding of the visual scene, text, and related knowledge. We then take a step further to show that training with the ME data boosts the model's performance in standard VCR evaluation. Lastly, our in-depth analysis and comparison reveal interesting findings: (1) semantically low-level information can assist the learning of high-level information but not the opposite; (2) visual information is generally underutilized compared with text.

SOTIF Entropy: Online SOTIF Risk Quantification and Mitigation for Autonomous Driving

Liang Peng , Boqi Li , Wenhao Yu , Kai Yang , Wenbo Shao , Hong Wang

Categories: Artificial Intelligence

2022-11-08

Autonomous driving confronts great challenges in complex traffic scenarios, where the risk of Safety of the Intended Functionality (SOTIF) can be triggered by the dynamic operational environment and system insufficiencies. The SOTIF risk is reflected not only intuitively in the collision risk with objects outside the autonomous vehicles (AVs), but also inherently in the performance limitation risk of the implemented algorithms themselves. How to minimize the SOTIF risk for autonomous driving is currently a critical, difficult, and unresolved issue. Therefore, this paper proposes the "Self-Surveillance and Self-Adaption System" as a systematic approach to minimizing the SOTIF risk online, which aims to provide a systematic solution for monitoring, quantification, and mitigation of inherent and external risks. The core of this system is the risk monitoring of the implemented artificial intelligence algorithms within the AV. As a demonstration of the Self-Surveillance and Self-Adaption System, the risk monitoring of the perception algorithm, i.e., YOLOv5, is highlighted. Moreover, the inherent perception algorithm risk and external collision risk are jointly quantified via SOTIF entropy, which is then propagated downstream to the decision-making module and mitigated. Finally, several challenging scenarios are demonstrated, and Hardware-in-the-Loop experiments are conducted to verify the efficiency and effectiveness of the system. The results demonstrate that the Self-Surveillance and Self-Adaption System enables dependable online monitoring, quantification, and mitigation of SOTIF risk in real-time critical traffic environments.

Versatile Real-Time Motion Synthesis via Kino-Dynamic MPC with Hybrid-Systems DDP

He Li , Tingnan Zhang , Wenhao Yu , Patrick M. Wensing

Categories: Robotics

2022-09-28

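Specialized motions on quadruped robots are typically achieved by solving a trajectory optimization problem and executing the resulting trajectory with a tracking controller. This approach runs in parallel with model predictive control (MPC) strategies, which commonly control regular gaits through online replanning. In this work, we propose a nonlinear MPC (NMPC) technique that naturally replans both specialized motion skills and regular locomotion within a unified framework. The NMPC reasons about a hybrid kinodynamic model and is solved using a variant of a constrained differential dynamic programming (DDP) solver. The proposed NMPC enables the robot to perform a variety of agile skills, such as jumping, bounding, and trotting, as well as rapid transitions between these skills. We evaluate the proposed algorithm on three challenging motion sequences that combine multiple agile skills on two quadruped platforms, the Unitree A1 and the MIT Mini Cheetah, demonstrating its effectiveness and generality.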

Zero-Shot Retargeting of Learned Quadruped Locomotion Policies Using Hybrid Kinodynamic Model Predictive Control

He Li , Tingnan Zhang , Wenhao Yu , Patrick M. Wensing

Categories: Robotics

2022-09-28

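Reinforcement learning (RL) has made great strides in quadruped locomotion, with continued progress in the reliable sim-to-real transfer of policies. However, reusing a policy on another robot, which could save retraining time, remains a challenge. In this work, we propose a framework for zero-shot policy retargeting in which diverse motor skills can be transferred between robots of different shapes and sizes. The new framework centers on a planning-and-control pipeline that systematically integrates RL and model predictive control (MPC). The planning stage employs RL to generate a dynamically plausible trajectory along with a contact schedule, avoiding the combinatorial complexity of contact sequence optimization. This information is then used to seed the MPC, which stabilizes and robustifies the policy rollout via a new hybrid kinodynamic (HKD) model that implicitly optimizes the foothold locations. Hardware results demonstrate the ability to transfer policies from the A1 and Laikago robots to the MIT Mini Cheetah robot without retuning the policies.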
